Terminology Extraction for Academic Slovene Using Sketch Engine

نویسندگان

  • Darja Fiser
  • Vit Suchomel
  • Milos Jakubícek
چکیده

In this paper we present the development of the terminology extraction module for Slovene which was framed within the Sketch Engine corpus management system and motivated by the KAS research project on resources and tools for analysing academic Slovene. We describe the formalism used for defining the grammaticality of terms as well as the calculation of the score of individual terms, give an overview of the definition of the term grammar for Slovene and evaluate it on a Slovene KAS corpus of academic Slovene.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Bilingual Terminology Extraction in Sketch Engine

We present a method of bilingual terminology extraction from parallel corpora and a few heuristics and experiments with improving the performance of the basic variant of the method. An evaluation is given using a small gold standard manually prepared for EnglishCzech language pair from DGT translation memory [1]. The bilingual terminology extraction (ABTE3) is available for several languages in...

متن کامل

Slovene Word Sketches

Word sketches are one-page automatic, corpus-based summaries of a word's grammatical and collocational behaviour. They were first used in the production of the Macmillan English Dictionary (Rundell 2002). At that point, they only existed for English. Today, the Sketch Engine is available, a corpus tool which takes as input a corpus of any language and corresponding grammar patterns and which ge...

متن کامل

Extracting Academic Subjects Semantic Relations Using Collocations

The paper presents approach to analyze semantic content of academic subjects and its internal relations using statistically-based techniques for collocation extraction from large electronic educational text corpus. It offers a survey and analysis of some related corpus-based approaches to extract conceptual relations used for educational purpose and presents a technique for semantic search of c...

متن کامل

GDEX for Slovene

Good Dictionary Examples or GDEX is a tool in the Sketch Engine designed to help lexicographers with identifying dictionary examples by ranking sentences according to how likely they are to be good candidates. The ranking is done automatically using various syntactic and lexical features. So far, only GDEX for English has been available. This paper presents the design and evaluation of Slovene ...

متن کامل

Pattern-based Word Sketches for the Extraction of Semantic Relations

Despite advances in computer technology, terminologists still tend to rely on manual work to extract all the semantic information that they need for the description of specialized concepts. In this paper we propose the creation of new word sketches in Sketch Engine for the extraction of semantic relations. Following a pattern-based approach, new sketch grammars are developed in order to extract...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016